1. Description of Progress

Figure 1 shows an overview of the SDC/NYU natural language system. The following paragraphs describe the
progress during the first quarter associated with the various system components. A major
1.1. Grammar



The grammar is being extended to cover the wider range of constructions found in the CASREP material.

Current efforts are focused on both the BNF and restriction components of the grammar. The changes
include: (1) simplification of the verbal auxiliary system, with a corresponding simplification of the
yes-no question rule; (2) a more fine-grained treatment of infinitival complements, which should permit
selection restrictions to apply more accurately; and (3) a fuller treatment of the internal structure of
noun and adjective phrases. The restriction component of the grammar is currently being developed to
accommodate these changes, since they would otherwise greatly increase the number of parses generated by
the grammar.

In  addition,  a  conjunction  mechanism  has  been  added  to  handle  conjoining  both  within  noun  phrases  and  within 
assertions.  This  component  has  been  tested,  but  not  yet  integrated  into  the  semantic  component. 


1.2.  Noun  Phrase  Resolution 

The noun phrase resolution module consists of two components: noun phrase semantics and reference
resolution.

Noun phrase semantics is called by the parser during the parse of a sentence, after each noun phrase has
been parsed. It performs two functions. The first function is to assist the parser by informing it whether or not
the parsed noun phrase is semantically acceptable in the current domain. For example, in the sentence "field engineer
replaced disk drive at 11/2/0800", "disk drive at 11/2/0800" is a syntactically acceptable noun phrase (as in
"participants at the meeting"). However, it is not semantically acceptable, because "at 11/2/0800" is intended to
designate the time of the replacement, not a property of the disk drive. Noun phrase semantics will inform the parser
that the noun phrase is not semantically acceptable, and the parser can then look for another parse. In order for this
capability to be fully utilised, however, an extensive set of domain-specific rules about semantic acceptability is
required. At present we have only the minimal set used for the development of the basic mechanism. For example, in
the case described here, "at 11/2/0800" is excluded as a modifier for "disk drive" by a rule that permits only the name
of a location as the object of "at" in a prepositional phrase modifying a noun phrase. In the form of a Prolog clause
this rule can be stated as follows.

at(X) :-
    isa2(X, location).
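
The way such a rule rejects a modifier can be sketched as a small self-contained program. This is only an illustration: the isa2 facts, the wrapper np_modifier_ok/2 and the time term are assumptions made for the example, not part of the system described here.

```prolog
% Hypothetical type facts; the report mentions only isa2/2 and the type `location`.
isa2(paoli, location).
isa2(computer_room, location).

% The rule from the text: the object of "at" must be the name of a location.
at(X) :-
    isa2(X, location).

% np_modifier_ok(+Preposition, +Object): hypothetical entry point that noun
% phrase semantics might use to vet a prepositional modifier of a noun phrase.
np_modifier_ok(at, Object) :-
    at(Object).

% ?- np_modifier_ok(at, computer_room).     % succeeds
% ?- np_modifier_ok(at, time_expression).   % fails: not known to be a location
```

When the second query fails, the noun phrase is reported as unacceptable and the parser is free to attach the prepositional phrase elsewhere.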

The  second  function  of  noun  phrase  semantics  is  to  create  a  semantic  representation  of  the  noun  phrase,  which  will 
later  be  operated  on  by  reference  resolution.  For  example,  the  semantics  for  the  bad  disk  drive  would  be  represented 
by  the  following  Prolog  clauses. 

hasname(disk_drive, X).

bad(X). 

def(X).  %  that is, X was referred to with a full, definite noun phrase,

full_np(X).  %  rather  than  a  pronoun  or  indefinite  noun  phrase. 

The  second  component,  reference  resolution,  is  currently  integrated  with  clause  semantics.  The  functions  of 
clause  semantics  are  to  create  a  semantic  decomposition  for  the  verbs  in  a  text  and  to  instantiate  the  semantic 
roles  of  the  predicates  in  the  decomposition.  When  clause  semantics  is  ready  to  instantiate  a  semantic  role,  it 
calls reference resolution, which provides a referent, e.g., [drive1], as the filler for the role. Control then returns
to clause semantics, which checks the selectional restrictions on the verb to determine whether or not the proposed
referent is semantically appropriate. If it is not, reference resolution is called again; otherwise clause semantics
continues instantiating semantic roles. Reference resolution not only provides referents, but also performs
other pragmatic functions, such as associating parts with what they are part of. For example, something referred
to as "the motor" can be associated with a previously mentioned disk drive that it is part of. In order to perform
this association, reference resolution makes heavy use of the knowledge represented in the domain model. A version
of Sidner’s (1979) focusing algorithm is used for this function, as well as for pronoun resolution. Recently
the reference resolution module has been augmented with the capability of resolving pronouns that refer to
events, as in "Field engineer replaced disk drive. It fixed the system." In this example, "it" would be resolved as
meaning the field engineer's replacement of the disk drive.
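
The interplay between clause semantics and reference resolution described above maps naturally onto Prolog backtracking. The following is a minimal sketch under stated assumptions: every predicate name and fact in it is hypothetical, and it does not attempt to model the focusing algorithm.

```prolog
% Hypothetical discourse entities, in the order reference resolution proposes them.
candidate(engineer1).
candidate(drive1).

% Hypothetical type facts and one selectional restriction: the patient of
% "replace" must be a machine part.
type_of(engineer1, person).
type_of(drive1, machine_part).
selectional_restriction(replace, patient, machine_part).

% reference_resolution(-Referent): proposes one candidate at a time and
% proposes the next one when it is called again on backtracking.
reference_resolution(Referent) :-
    candidate(Referent).

% fill_role(+Verb, +Role, -Referent): clause semantics requests a referent and
% checks it; a failed check backtracks into reference resolution.
fill_role(Verb, Role, Referent) :-
    reference_resolution(Referent),
    selectional_restriction(Verb, Role, Type),
    type_of(Referent, Type).

% ?- fill_role(replace, patient, R).   % R = drive1, after engineer1 is rejected
```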

1.3. Clause Semantics

Our initial work was done on a Maintenance Report domain consisting of SDC maintenance reports. This
corpus allowed us to begin developing the clause semantics component for an actual maintenance domain, while
negotiations were being carried out to determine the actual domain for the deliverable system.

The  texts  analysed  are  actual  maintenance  reports  as  they  are  called  into  the  Burroughs  Telephone  Tracking 
System  by  field  engineers  and  typed  in  by  telephone  operators.  These  reports  give  information  about  a  customer  who 
has a problem, specific symptoms of the problem, any actions taken by the field engineer to try to correct the
problem, and the success or failure of such actions. The goal of the text analysis is to automatically generate a data base
of  maintenance  information  that  can  be  used  to  correlate  customers  to  problems,  problem  types  to  machines,  and  so 
on. 

The first step in building a domain model for maintenance reports was to build a semantic net representation of
the machine involved. The machine in the example text given below is a B4700. The possible parts of a B4700 and
the associated properties of these parts can be represented using four basic predicates: system, haspart, hasprop,
and hasvalue. For example, the system itself is indicated by system(b4700). The main components of the system
(cpu, power_supply, disk, printer, peripherals, etc.) are indicated by haspart relations, such as
haspart(b4700,cpu), haspart(b4700,power_supply), haspart(b4700,disk), etc. These parts are themselves
divided into subparts, which are also indicated with haspart relations, such as haspart(power_supply,converter).
Parts can have properties associated with them, e.g., hasprop(fuse,amp), which can in turn have values, e.g.,
hasvalue(amp,70). This method of representation results in a general description of a type of computer system.
Specific machines represent instances of this general representation. When a particular report is being processed,
hasname relations are created to associate the specific computer parts being mentioned with the part descriptions
from the general machine representation. So a particular B4700 would be indicated by predicates such as these:
hasname(b4700,system1), hasname(cpu,cpu1), hasname(power_supply,power_supply1), etc.
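
Written out as Prolog facts, the relations quoted above form a small knowledge base that can be queried directly. The sketch below uses only the facts given in the text; the helper part_of/2, which follows haspart links transitively, is a hypothetical addition for illustration.

```prolog
% General description of the machine type (facts taken from the text).
system(b4700).
haspart(b4700, cpu).
haspart(b4700, power_supply).
haspart(b4700, disk).
haspart(power_supply, converter).
hasprop(fuse, amp).
hasvalue(amp, 70).

% Instances created while a particular report is being processed.
hasname(b4700, system1).
hasname(cpu, cpu1).
hasname(power_supply, power_supply1).

% part_of(?Part, ?Whole): hypothetical helper, true when Part is a direct or
% indirect subpart of Whole in the general description.
part_of(Part, Whole) :-
    haspart(Whole, Part).
part_of(Part, Whole) :-
    haspart(Whole, Mid),
    part_of(Part, Mid).

% ?- part_of(converter, b4700).   % succeeds, via power_supply
```

A query of this kind is one way reference resolution could associate a mentioned part with the larger component it belongs to.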

To represent the information conveyed in the report, two other types of representation are needed as well:
causal events and script events. A given event can be both a causal event and a script event. The representation
includes a name, e.g., event1, a predicate decomposition of the event, and a time argument. The time can be explicit,
such as 03-14/1700, or relative, as in before(event2). The causal events are used to represent the underlying
problems in the computer system. For instance, a failure of the power_supply can cause a failure of the entire
system. These causal chains are referred to explicitly in the text, as in "System down with solid power failure". The
causal events are produced by doing a semantic analysis of the sentence fragments of the report, with each syntactic
fragment generally corresponding to a causal event. The semantic processing technique used is Inference-Driven Semantic
Analysis. During the first quarter, we analysed the maintenance texts and developed semantic rules for the basic
maintenance verbs; we have also begun to investigate the use of Inference-Driven Semantic Analysis techniques for
the analysis of nominalised verbs, as in "The field engineer performed the alignment of the disk" vs. "The field
engineer aligned the disk".
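
As an illustration of the event representation (a name, a predicate decomposition, and a time argument), the power failure example might be encoded as follows. The predicates event/3, causes/2 and root_cause/2 and the decomposition terms are assumptions made for this sketch; the report does not give the actual encoding.

```prolog
% event(Name, Decomposition, Time): hypothetical encoding of the three parts
% named in the text. Time is either explicit or relative to another event.
event(event1, failed(power_supply1), before(event2)).
event(event2, down(system1), '03-14/1700').

% Hypothetical causal link: the power supply failure causes the system failure.
causes(event1, event2).

% root_cause(+Event, -Root): follows causal links back to an event that
% nothing else causes.
root_cause(Event, Root) :-
    causes(Earlier, Event), !,
    root_cause(Earlier, Root).
root_cause(Event, Event).

% ?- root_cause(event2, R).   % R = event1
```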

Script  events  represent  the  sequence  of  actions  for  a  maintenance  call.  The  report  begins  with  the  customer 
call,  so  this  is  the  first  script  event.  The  description  of  the  problem  is  usually  the  second  event  and  so  on.  The 
assignment  of  events  to  the  script  events  will  be  produced  after  the  text  has  been  analysed  and  the  causal  event 
representation  has  been  established.  Script  events  have  not  been  implemented  yet. 

1.4.  Facilities 

Most  of  the  work  that  we  have  done  on  the  Symbolics  machines  has  been  preparatory  for  software  development. 
Several  people  have  been  studying  the  machine  and  learning  how  to  incorporate  it  into  our  working  environment.  In 
addition,  the  porting  of  already-developed  software  from  our  Vax  has  already  begun. 

We have networked the Symbolics machines with our Vaxen. Files can be copied in either direction, and the
Symbolics machines can use the Vaxen as file servers and execute commands on them remotely.
We have also implemented a scheme to allow the Symbolics machines to use the printers connected to the Vax
for  hardcopy  and  screen  dumps.  A  lot  of  small  customisations  and  modifications  have  been  made  to  tailor  the 
machines  to  fit  our  own  tastes  for  a  friendly  development  environment. 
We have also been working out mechanisms to allow several people to work on the same core programs and
maintain individual modifications, as well as incorporating techniques for controlling the complexity of large
programs. A reasonable backup system has also been investigated.

The  people  who  have  been  accumulating  experience  working  on  the  Symbolics  machines  have  started  to  transfer 
their knowledge to other users. Some of the Natural Language System has been successfully ported over to the
Symbolics machine; in the process, we have uncovered numerous bugs in the Symbolics Prolog implementation. Current
progress  is  hampered  by  lack  of  support  by  Symbolics  for  its  Prolog  implementation. 


2.  Change  In  Key  Personnel 

The  project  is  currently  fully  staffed  by  the  following  people  (approximate  percent  of  time  indicated  as  well). 


Lynette Hirschman    Sr. Staff Research Scientist     33%
Martha Palmer        Staff Research Scientist         17%
Deborah Dahl         Research Scientist               60%
Marcia Linebarger    Research Scientist               33%
Karen Wieckert       Research Scientist Assoc.       100%
John Dowding         Research Scientist Assoc.       100%


3. Summary of Substantive Information from Meetings and Conferences


3.1. Professional Computational Linguistics Meetings


ACL '85 (July 8-12, University of Chicago)

The 23rd meeting of the Association for Computational Linguistics was attended by John Dowding and Karen
Wieckert. Dowding and Wieckert attended tutorials on TT, as well as sessions on ...


IEEE  Logic  Programming  Symposium  (July  15-18,  Boston,  Mass.) 

This Symposium was sponsored by the IEEE Technical Committee on Computer Languages, and was attended
by  John  Dowding  and  Lew  Norton.  Dowding  focused  on  sessions  relating  to  parsing  issues. 


3.2. SDC/NYU Meetings


SDC/NYU  Meeting  #1  (February  21,  New  York  University) 

Lynette Hirschman, Martha Palmer, Deborah Dahl, John Dowding and Karen Wieckert attended a meeting
with Grishman, Ksiezyk and Nhan at NYU. Topics of discussion included noun phrase semantics (Dahl), chart
parsing (Gawron), conjunction parsing (Hirschman), frame representation (Ksiezyk), and clause semantics (Palmer).


SDC/NYU Meeting #2 (March 22, Paoli, PA)

Ralph Grishman, Mark Gawron, Tomasz Ksiezyk, and Ngo Thanh Nhan came to Paoli for a meeting. Martha
Palmer gave a talk on clause semantics, John Dowding and Karen Wieckert gave a talk on lexical lookup and entry
procedures, Tomasz Ksiezyk gave a talk on a frame representation language, and Mark Gawron gave a talk on using
lambda conversion to build an intermediate semantic representation.


SDC/NYU Meeting #3 (April IS, Paoli, PA)

Staff from SDC and NYU met again to prepare a presentation for the DARPA Technology Review Panel
meeting in Hawaii. We also pursued discussions on lambda conversion, the relationship between frames and clause
semantics, and the overall organisation of the system.


SDC/NYU Meeting #4 (June 27, New York University)

On this trip John Dowding and Marcia Linebarger from SDC went to NYU to discuss parsing issues with
Mark Gawron.

3.3. DARPA Meetings

DARPA  Natural  Language  Technology  Review  Panel  Meeting  (May  1-2,  Hawaii) 

The first Natural Language Technology Review Panel (NLTRP) meeting was held at CINCPACFLT, Pearl
Harbor, Hawaii. This meeting included the major natural language contracts (BBN, ISI, SDC, NYU), as well as Texas
Instruments, the Expert Systems contract. The overall structure of the Battle Management Program was reviewed,
and we then focused on the natural language components and their integration. At this meeting, we held side
meetings with CINCPACFLT personnel and various DARPA personnel and consultants to choose a domain for text
processing. The outcome of these meetings was a proposal to use an existing collection of CASREP texts describing
failures of the starting air compressor. The drawback to this material was its very narrow focus, which made its
relation to the broader Force Readiness Expert System (FRESH) less clear; however, the conclusion was that it would be
too burdensome to model any broader domain for the initial system.


DARPA Meeting of Natural Language Contractors (May 22-22, Rosslyn, VA)

This meeting served to introduce and organise the seven NL contractors into two groups: SDC, NYU and SRI
for text processing; BBN, ISI, U. Mass and Penn for user interfaces. The meeting also raised questions about the
integration of components into a final system and the criteria for measuring the success of the overall system and
its components.

3.4. Symbolics Lisp User's Group (SLUG) Meetings


SLUG Meeting (June 2 & 4, San Francisco, CA)

SLUG  '85  was  the  organisational  meeting  for  the  Symbolics  LISP  Users  Group,  and  was  attended  by  John 
Dowding  and  Karen  Wieckert.  Useful  information,  including  details  of  the  design  of  Symbolics  Prolog,  was 
obtained  both  from  Symbolics  officials,  and  other  experienced  Symbolics  users. 


MAD-SLUG  Meeting  (June  12,  Princeton  University) 

This  meeting  of  the  Mid-Atlantic  Division  of  SLUG  was  attended  by  John  Dowding,  Charles  Kosloff,  and  Chris 
Andrews.  These  quarterly  meetings  will  be  hosted  by  different  Universities  and  Corporations.  At  this  meeting,  there 
were  several  guest  speakers  from  Symbolics,  and  the  details  of  organising  a  library  of  free  software  were  worked  out. 


4.  Problems  Encountered  and/or  Anticipated 

The  major  problem  to  date  concerns  Prolog  for  the  Symbolics  Machine.  SDC  has  two  machines  installed:  one 
purchased  by  SDC  and  one  GFE.  At  the  moment,  only  the  SDC  machine  has  Prolog.  The  Government-furnished 
machine has no Prolog, due to licensing problems between DARPA, ISI and Symbolics. In addition, the Prolog
furnished by Symbolics has numerous problems, including various bugs, lack of development environment and debugging
facilities, and inadequate support. SDC is committed to a Prolog environment for its natural language work, since its
present large system has been written in Prolog. Lack of an adequate Prolog on the Symbolics will present serious
problems in developing the system, as well as in integrating smoothly with the Lisp components under development
at NYU.

5. Action Required by the Government

Timely negotiation of the Prolog License agreement to ISI would enable SDC to use the Government-furnished
Symbolics  machine. 

6. Fiscal Status

(1)  Amount  currently  provided  on  contract: 

$339,728 (funded)  $683,105 (contract value)

(2)  Expenditures  and  commitments  to  date: 

$86,540 (through August 31, 1985)

(3)  Funds  required  to  complete  work: 

$253,188 (Year 1)  $596,565 (Yrs. 1-2)